Prepared by: Andreas Mavrocordatos
Date: 20-06-2023
In this notebook, I explore the use of numerical approximations for second-order differential equations, using Taylor series expansion and central difference method. The problem is transformed into matrix form, implemented below, and the results are compared with the exact solution.
# Import libraries for data manipulation
import pandas as pd
import numpy as np
# Import libraries for visualization
import plotly.graph_objs as go
import plotly.figure_factory as ff
import matplotlib.pyplot as plt
# Import libraries for statistical analysis
from scipy.linalg import solve
import warnings
warnings.filterwarnings('ignore')
The model problem is defined as: \begin{equation} \label{eq:model_problem} \frac{d^2y}{dx^2}+P(x)\frac{dy}{dx}+Q(x)y = f(x) \end{equation}
subject to the boundary conditions $y(a) = \alpha$ and $y(b) = \beta$.
Consider the regular partition of the interval $[a,b]$ as: $a = x_0 <x_1< x_2< \dots <x_{n-1} < x_n = b$ and define $\delta x = \frac{b-a}{n}$. In this project, I will implement a numerical method to solve the differential equation and test it on the following problem: $$\frac{d^2y}{dx^2}+3\frac{dy}{dx}+2y = 4x^2, \quad y(1)=1, \quad y(2)=6$$
Let: $$y_i = y(x_i), P_i = P(x_i), Q_i = Q(x_i), f_i = f(x_i)$$
We first show that the model problem defined in the introduction can be expressed as: \begin{equation} \label{eq:taylor_approx} \left(1+\frac{\delta x}{2}P_i\right)y_{i+1}+(-2+\delta x^2Q_i)y_i + \left( 1-\frac{\delta x}{2}P_i \right) y_{i-1} = \delta x^2 f_i,\quad i=1,\dots,n-1. \end{equation}
by using a Taylor series expansion and a numerical approximation for each derivative term:
Firstly, we must approximate the first-order derivative term $\displaystyle\frac{dy}{dx}$. To do so, we use the central difference method: $$\frac{dy}{dx} \approx \frac{y_{i+1}-y_{i-1}}{2\delta x}$$
Similarly, we can approximate the second-order derivative term $\displaystyle\frac{d^2y}{dx^2}$ using again the central difference method: $$\frac{d^2y}{dx^2} \approx \frac{y_{i+1}-2y_i+y_{i-1}}{\delta x^2}$$
Substituting these approximations into the model problem, we obtain: \begin{alignat}{2} \label{eq:numerical_model} f_i &= \dfrac{y_{i+1}-2y_i+y_{i-1}}{\delta x^2} + P_i \dfrac{y_{i+1}-y_{i-1}}{2\delta x} + Q_iy_i \nonumber \\ \Longleftrightarrow \quad f_i\delta x^2 &= y_{i+1}-2y_i+y_{i-1} + P_i(y_{i+1}-y_{i-1})\frac{\delta x}{2}+ Q_iy_i \delta x^2 \nonumber \\ &= y_{i+1} + y_{i+1} P_i \frac{\delta x}{2} - 2y_i + y_i Q_i\delta x^2 + y_{i-1} - y_{i-1} P_i \frac{\delta x}{2} \nonumber \\ &= y_{i+1} \left( 1 + \frac{\delta x}{2} P_i \right) + y_i \left( Q_i \delta x^2 - 2 \right) + y_{i-1} \left( 1 - P_i \frac{\delta x}{2} \right), \quad \forall i \in \{1, \dots, n-1\} \end{alignat}
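As a quick sanity check (separate from the derivation), the two central-difference formulas can be verified numerically on a function with known derivatives; the choice $y(x)=\sin x$, the point `x0`, and the step `dx` below are arbitrary and purely illustrative.

```python
import numpy as np

# Verify the central-difference formulas on y(x) = sin(x),
# whose exact derivatives are cos(x) and -sin(x).
dx = 1e-4
x0 = 0.7
y = np.sin

# First derivative: (y_{i+1} - y_{i-1}) / (2*dx)
first = (y(x0 + dx) - y(x0 - dx)) / (2 * dx)
# Second derivative: (y_{i+1} - 2*y_i + y_{i-1}) / dx^2
second = (y(x0 + dx) - 2 * y(x0) + y(x0 - dx)) / dx**2

print(abs(first - np.cos(x0)))   # tiny: the scheme has O(dx^2) truncation error
print(abs(second + np.sin(x0)))  # tiny
```

Both errors shrink roughly by a factor of four each time `dx` is halved, consistent with the second-order accuracy of the central difference scheme.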
In order to consider the matrix inversion problem, we can rewrite the finite-difference scheme in matrix notation. To do so, it is necessary to consider the boundary conditions $y_0 = y(a) = \alpha$ and $y_n = y(b) = \beta$. Consider the tridiagonal matrix $\mathbf{A}$ of size $(n+1) \times (n+1)$. The first row of this matrix represents the equation for $y_0$, and the last row the equation for $y_n$.
Consider the vectors $\boldsymbol{y}$ and $\boldsymbol{b}$ as follows: \begin{equation*} \begin{array}{@{}c@{\;}c@{}} \mathbf{y} = \begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} & \qquad \mathbf{b} = \begin{pmatrix} \alpha \\ f_1\delta x^2 \\ f_2\delta x^2 \\ \vdots \\ \beta \end{pmatrix} \end{array} \end{equation*}
Then $\mathbf{A} \mathbf{y} = \mathbf{b}$ can be denoted as: \begin{equation} \label{eq:matrix_model} \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 1-\frac{\delta x}{2}P_1 & -2+\delta x^2Q_1 & 1+\frac{\delta x}{2}P_1 & \cdots & 0 \\ 0 & 1-\frac{\delta x}{2}P_2 & -2+\delta x^2Q_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix} \begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} \alpha \\ f_1\delta x^2 \\ f_2\delta x^2 \\ \vdots \\ \beta \end{pmatrix} \end{equation}
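To make the structure of $\mathbf{A}$ concrete, the sketch below assembles it for the example problem ($P_i = 3$, $Q_i = 2$) on $[1,2]$ with a deliberately coarse $n = 4$; this is purely illustrative and separate from the full implementation later in the notebook.

```python
import numpy as np

# Assemble the (n+1) x (n+1) tridiagonal matrix A for P = 3, Q = 2
# on [1, 2] with a coarse partition n = 4 (so dx = 0.25).
n = 4
a, b = 1.0, 2.0
dx = (b - a) / n

A = np.zeros((n + 1, n + 1))
A[0, 0] = 1.0  # boundary row enforcing y_0 = alpha
A[n, n] = 1.0  # boundary row enforcing y_n = beta
for i in range(1, n):
    A[i, i - 1] = 1 - dx / 2 * 3   # 1 - (dx/2) P_i
    A[i, i]     = -2 + dx**2 * 2   # -2 + dx^2 Q_i
    A[i, i + 1] = 1 + dx / 2 * 3   # 1 + (dx/2) P_i

print(A)
```

Only the main diagonal and its two neighbours are nonzero, so for large $n$ a banded solver could exploit this structure; the dense solve used below keeps the implementation simple.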
In this section, we derive the exact solution of the differential equation:
$$\frac{d^2y}{dx^2}+3\frac{dy}{dx}+2y = 4x^2, \quad y(1)=1, y(2)=6$$
This problem is a second-order linear inhomogeneous Ordinary Differential Equation (ODE) with given boundary conditions. To solve it analytically, we will find a general solution $y(x) = y_c(x) + y_p(x)$, where $y_c(x)$ is the complementary function (associated with the homogeneous equation) and $y_p(x)$ is a particular solution.
Complementary Function
Firstly, we determine the complementary function $y_c(x)$ by setting the right-hand side of the problem to zero: $$\frac{d^2y}{dx^2}+3\frac{dy}{dx}+2y = 0$$
The characteristic equation for the ODE can be written as: \begin{align} 0 &= \lambda^2 + 3 \lambda + 2 \nonumber \\ &= (\lambda + 2)(\lambda + 1) \nonumber \end{align} It has two distinct real roots, $\lambda_1 = -1$ and $\lambda_2 = -2$. Therefore: \begin{align} \label{eq:complementary_sol} y_c(x) &= C_1 e^{(\lambda_1 x)}+ C_2 e^{(\lambda_2 x)}\nonumber \\ &= C_1 e^{(-x)}+ C_2 e^{(-2x)} \end{align}
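The roots of the characteristic polynomial can be double-checked numerically (a small sanity check, not part of the derivation):

```python
import numpy as np

# Roots of lambda^2 + 3*lambda + 2 (coefficients in decreasing degree).
roots = np.sort(np.roots([1, 3, 2]))
print(roots)  # approximately [-2., -1.]
```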
Particular Function
We now determine the particular function $y_p(x)$. Since the right-hand side of the problem is a quadratic function, we assume that the particular solution takes the form: \begin{align} y_p(x) &= Ax^2 + Bx + C \nonumber \\ y'_p(x) &= 2Ax + B \nonumber \\ y''_p(x) &= 2A \nonumber \end{align}
Substituting $y_p(x)$, $y'_p(x)$ and $y''_p(x)$ into the ODE, we obtain: \begin{align} 2A + 3(2Ax + B) + 2(Ax^2 + Bx + C) &= 4x^2 \nonumber \\ 2Ax^2 + (6A + 2B)x + (2A + 3B + 2C) &= 4x^2 \nonumber \end{align}
Therefore, matching the coefficients of each power of $x$, we obtain: \begin{align} 4 &= 2A \nonumber \\ 0 &= 6A + 2B \nonumber \\ 0 &= 2A+3B+2C \nonumber \end{align}
The above equations are therefore solved to obtain: $A=2$, $B= -6$ and $C=7$. The particular solution is therefore: \begin{equation} \label{eq:particular_sol} y_p(x) = 2x^2 -6x + 7 \end{equation}
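A quick numerical check (illustrative only) confirms that $y_p$ satisfies the full ODE, using the exact derivatives $y_p' = 4x - 6$ and $y_p'' = 4$:

```python
import numpy as np

# Check that y_p'' + 3*y_p' + 2*y_p equals 4x^2 for y_p = 2x^2 - 6x + 7
# at a handful of sample points in [1, 2].
x = np.linspace(1, 2, 5)
lhs = 4 + 3 * (4 * x - 6) + 2 * (2 * x**2 - 6 * x + 7)
print(np.allclose(lhs, 4 * x**2))  # True
```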
General Solution
Combining the complementary and particular solutions, we can construct the general solution: $$y(x) = C_1 e^{(-x)} + C_2 e^{(-2x)} + 2x^2 - 6x + 7$$
We can now apply the boundary conditions $y(1)=1$ and $y(2)=6$ to obtain the following system of linear equations: $$ y(1) = C_1 e^{(-1)}+ C_2 e^{(-2)} + 2 - 6 + 7 = 1 $$ \begin{equation} \label{eq:y1_boundary} \Longleftrightarrow \quad C_1 \frac{1}{e} + C_2 \frac{1}{e^2} = -2 \end{equation}
and $$ y(2) = C_1 e^{(-2)}+ C_2 e^{(-4)} + 8 -12 + 7 = 6 $$ \begin{equation} \label{eq:y2_boundary} \Longleftrightarrow \quad C_1 \frac{1}{e^2} + C_2 \frac{1}{e^4} = 3 \end{equation}
We can rewrite the first equation by solving for $C_1$ to obtain: $$C_1 = -2e - \frac{C_2}{e}$$
We can then substitute this result into the second equation, obtained from $y(2)$: $$\frac{1}{e^2}\left(-2e - \frac{C_2}{e}\right) + C_2\frac{1}{e^4} = 3 \quad \Longleftrightarrow \quad -\frac{2}{e} - \frac{C_2}{e^3} + \frac{C_2}{e^4} = 3$$
Rearranging the equation, we get: \begin{align} \label{eq:solve_C2} C_2\left( \frac{1}{e^4}-\frac{1}{e^3} \right ) &= 3 + \frac{2}{e} \nonumber \\ C_2\left( \frac{1-e}{e^4}\right ) &= \frac{3e+2}{e}\nonumber \\ C_2 &= \frac{3e^4 + 2e^3}{1-e} \end{align}
Finally, by substitution, we obtain: \begin{align} \label{eq:solve_C1} C_1 &= -2e - \frac{1}{e}\times \frac{3e^4 + 2e^3}{1-e}\nonumber \\ C_1 &= -2e - \frac{3e^3 + 2e^2}{1-e} \end{align}
Therefore, the general solution of the second-order ODE is: $$y(x) = C_1 e^{(-x)} + C_2 e^{(-2x)} + 2x^2 - 6x + 7$$
where: $$C_1 = -2e - \frac{3e^3 + 2e^2}{1-e}$$ and $$C_2 = \frac{3e^4 + 2e^3}{1-e}$$
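These closed-form constants can be sanity-checked by evaluating the general solution at the boundary points (an illustrative check, not part of the derivation):

```python
import numpy as np

# Evaluate y(x) = C1*exp(-x) + C2*exp(-2x) + 2x^2 - 6x + 7
# at the boundaries and compare with y(1) = 1 and y(2) = 6.
e = np.exp(1)
C1 = -2 * e - (3 * e**3 + 2 * e**2) / (1 - e)
C2 = (3 * e**4 + 2 * e**3) / (1 - e)

def y_exact(x):
    return C1 * np.exp(-x) + C2 * np.exp(-2 * x) + 2 * x**2 - 6 * x + 7

print(y_exact(1.0), y_exact(2.0))  # approximately 1.0 and 6.0
```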
In this section, the numerical approximation is implemented and the results are compared with the exact solution detailed in the previous section. It should be noted that the coefficients can easily be changed to test for different types of differential equations.
# Coefficient functions for the given differential equation
def P(x):
return 3
def Q(x):
return 2
def f(x):
return 4 * x**2
def solve_diff_eq(a, b, alpha, beta, n):
    # Calculate delta_x based on the interval endpoints and the number of partitions (n)
    delta_x = (b - a) / n
    # Create a partition of the interval [a, b] with n+1 points (including both boundaries)
    x = np.linspace(a, b, n+1)
    # Initialize the matrix A and right-hand-side vector B for the linear system
    A = np.zeros((n+1, n+1))
    B = np.zeros(n+1)
    # Set the boundary conditions in A and B
    A[0, 0] = 1    # First row, first column
    A[-1, -1] = 1  # Last row, last column
    B[0] = alpha   # First entry of the vector
    B[-1] = beta   # Last entry of the vector
    # Fill the interior rows of A and B using the finite-difference coefficients
    for i in range(1, n):
        xi = x[i]  # x_i as defined in the problem
        # Diagonal element of matrix A
        A[i, i] = -2 + delta_x**2 * Q(xi)
        # Off-diagonal elements of matrix A
        A[i, i-1] = 1 - delta_x/2 * P(xi)  # Element to the left of the diagonal
        A[i, i+1] = 1 + delta_x/2 * P(xi)  # Element to the right of the diagonal
        # Right-hand side of the linear system (vector B)
        B[i] = delta_x**2 * f(xi)
    # Solve the linear system to obtain the values of y (scipy.linalg.solve)
    y = solve(A, B)
    return x, y
# Initialize parameters based on values provided in the exam paper
a = 1
b = 2
alpha = 1
beta = 6
# Define the coefficients C1 and C2 as detailed in the Appendix of the report
e = np.exp(1)
C1 = -2 * e - (3 * e**3 + 2 * e**2) / (1 - e)
C2 = (3 * e**4 + 2 * e**3) / (1 - e)
# Define the exact solution of the ODE
def func(x):
    return C1 * e ** (-x) + C2 * e ** (-2 * x) + 2 * x ** 2 - 6 * x + 7
# Create a range of x values for the function
x_values = np.linspace(1, 2, 1000)  # Sample the interval [1, 2] set by the boundary conditions
y_values = func(x_values)
fig = go.Figure()
# Add the new line to the plot
fig.add_trace(go.Scatter(x=x_values, y=y_values, mode='lines', name='Exact Function'))
# Customize the plot
fig.update_layout(xaxis_title="x",
                  yaxis_title="y",
                  showlegend=False)
# Display the plot
fig.show()
As a reminder, mesh points are used to discretize the continuous interval into a finite set of points. Specifically, based on the above definitions, the mesh points are defined as:
$$ x_1 = a+\delta x,\ x_2 = a +2\delta x,\ \dots,\ x_{n-1} = a + (n-1)\delta x$$
n = 10
# Call the function with desired parameters
x_values, y_values = solve_diff_eq(a, b, alpha, beta, n)
# Create a DataFrame with the results (x_i, y_i)
results_10 = pd.DataFrame({"x_i": x_values, "y_i": y_values})
# Create a Plotly line plot
fig = go.Figure(go.Scatter(x=results_10['x_i'], y=results_10['y_i'], mode='lines+markers'))
# Customize the plot
fig.update_layout(title="Solution of the Differential Equation for n = 10",
                  xaxis_title="x",
                  yaxis_title="y",
                  showlegend=False)
# Display the plot
fig.show()
n = 50
# Call the function with desired parameters
x_values, y_values = solve_diff_eq(a, b, alpha, beta, n)
# Create a DataFrame with the results (x_i, y_i)
results_50 = pd.DataFrame({"x_i": x_values, "y_i": y_values})
# Create a Plotly line plot
fig = go.Figure(go.Scatter(x=results_50['x_i'], y=results_50['y_i'], mode='lines+markers'))
# Customize the plot
fig.update_layout(title="Solution of the Differential Equation for n = 50",
                  xaxis_title="x",
                  yaxis_title="y",
                  showlegend=False)
# Display the plot
fig.show()
n = 100
# Call the function with desired parameters
x_values, y_values = solve_diff_eq(a, b, alpha, beta, n)
# Create a DataFrame with the results (x_i, y_i)
results_100 = pd.DataFrame({"x_i": x_values, "y_i": y_values})
# Create a Plotly line plot
fig = go.Figure(go.Scatter(x=results_100['x_i'], y=results_100['y_i'], mode='lines+markers'))
# Customize the plot
fig.update_layout(title="Solution of the Differential Equation for n = 100",
                  xaxis_title="x",
                  yaxis_title="y",
                  showlegend=False)
# Display the plot
fig.show()
# Create a range of x values for the function
x_values = np.linspace(1, 2, 1000)  # Sample the interval [1, 2] set by the boundary conditions
y_values = func(x_values)
fig = go.Figure()
# Add trace for exact solution
fig.add_trace(go.Scatter(x=x_values, y=y_values, mode='lines', name='Exact Function'))
# Add traces for n = 10, 50, 100
fig.add_trace(go.Scatter(x=results_10['x_i'], y=results_10['y_i'], mode='lines+markers', name="n = 10"))
fig.add_trace(go.Scatter(x=results_50['x_i'], y=results_50['y_i'], mode='lines+markers', name="n = 50"))
fig.add_trace(go.Scatter(x=results_100['x_i'], y=results_100['y_i'], mode='lines+markers', name="n = 100"))
# Customize the plot
fig.update_layout(title="Solution of the Differential Equation for Different n Values",
                  xaxis_title="x",
                  yaxis_title="y")
# Display the plot
fig.show()
It is clear from the graph above that the numerical solution converges to the analytical solution as $n$ increases. We can also estimate how quickly this convergence happens. To do so, we will run the numerical approximation from $n=10$ to $n=1000$ in increments of $1$. For each value of $n$, let $x^n_i$ denote the mesh points of the corresponding partition, $y^n_i$ the numerical solution at those points, and $y^*_i$ the analytical solution evaluated at the same points. The difference between the numerical and analytical solutions for each $n$ is then measured as the average of the absolute differences $|y^n_i - y^*_i|$ over all $n+1$ points of the partition: $$\text{Average Difference} = \frac{1}{n+1}\sum_{i=0}^{n} |y^n_i - y^*_i|$$
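Before running the full sweep, the expected rate can be estimated with a short self-contained sketch (it re-implements the solver inline for the example problem so that it runs on its own; the pair $n = 20, 40$ is an arbitrary illustrative choice). Since the central-difference scheme is second order, doubling $n$ should reduce the average error by roughly a factor of four:

```python
import numpy as np
from scipy.linalg import solve

# Exact solution (constants from the analytical section)
e = np.e
C1 = -2 * e - (3 * e**3 + 2 * e**2) / (1 - e)
C2 = (3 * e**4 + 2 * e**3) / (1 - e)

def exact(x):
    return C1 * np.exp(-x) + C2 * np.exp(-2 * x) + 2 * x**2 - 6 * x + 7

def avg_error(n, a=1.0, b=2.0, alpha=1.0, beta=6.0):
    # Assemble and solve the finite-difference system for the example
    # problem (P = 3, Q = 2, f = 4x^2), then average |y_num - y_exact|.
    dx = (b - a) / n
    x = np.linspace(a, b, n + 1)
    A = np.zeros((n + 1, n + 1))
    B = np.zeros(n + 1)
    A[0, 0] = A[n, n] = 1.0
    B[0], B[n] = alpha, beta
    for i in range(1, n):
        A[i, i - 1] = 1 - dx / 2 * 3
        A[i, i] = -2 + dx**2 * 2
        A[i, i + 1] = 1 + dx / 2 * 3
        B[i] = dx**2 * 4 * x[i]**2
    return np.mean(np.abs(solve(A, B) - exact(x)))

# log2 of the error ratio when n doubles estimates the convergence order
order = np.log2(avg_error(20) / avg_error(40))
print(order)  # close to 2 for a second-order scheme
```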
# Store the average difference for each n
records = []
# Loop through n values from 10 to 1000
for n in range(10, 1001):
    # Run the numerical solution
    x_numerical, y_numerical = solve_diff_eq(a, b, alpha, beta, n)
    # Evaluate the analytical solution at the same mesh points
    y_analytical = func(x_numerical)
    # Calculate the absolute difference between the numerical and analytical solutions
    differences = np.abs(y_numerical - y_analytical)
    # Take the average difference for the current n
    records.append({'n': n, 'average_difference': np.mean(differences)})
# Build the DataFrame in one step (DataFrame.append was removed in pandas 2.0)
df = pd.DataFrame(records)
# Display the dataframe
df
|     | n      | average_difference |
|-----|--------|--------------------|
| 0   | 10.0   | 0.009783           |
| 1   | 11.0   | 0.008154           |
| 2   | 12.0   | 0.006900           |
| 3   | 13.0   | 0.005915           |
| 4   | 14.0   | 0.005126           |
| ... | ...    | ...                |
| 986 | 996.0  | 0.000001           |
| 987 | 997.0  | 0.000001           |
| 988 | 998.0  | 0.000001           |
| 989 | 999.0  | 0.000001           |
| 990 | 1000.0 | 0.000001           |

991 rows × 2 columns
# Create a scatter plot of the average difference versus n
fig = go.Figure(
    go.Scatter(x=df['n'], y=df['average_difference'], mode='lines', name='Average Difference')
)
# Update layout
fig.update_layout(
    title='Average Difference between Numerical and Analytical Solutions vs. n',
    xaxis_title='n',
    yaxis_title='Average Difference (log scale)',
    yaxis=dict(type='log', showgrid=True, gridwidth=0.5, zeroline=True, zerolinewidth=2)
)
# Show the plot
fig.show()
A log scale was used on the vertical axis to better visualize the behaviour of the average difference as $n$ increases: the precision improves by several orders of magnitude, which would be hard to see on a linear scale. The plot confirms that the numerical approximation improves significantly as the number of interior mesh points increases.
This exploration has shown that the numerical method is an effective approach for solving differential equation problems of this kind, owing to its rapid convergence towards the analytical solution as the number of interior mesh points increases. This allows for an accurate approximate solution even when the exact solution is not readily available.
It is noted that, in the context of this exploration, only a single differential equation problem was solved. The numerical method was therefore not tested on more complex differential equations, and future analysis could investigate, for example, differential equations with variable coefficients or higher-order differential equations. The computational cost is expected to grow as the number of unknowns increases, as the matrix operations become more involved, or as the boundary conditions become more complex.
Areas of improvement to the implemented method include testing discretization schemes other than the central difference used in this exploration, such as higher-order finite differences: instead of the basic central difference scheme, a higher-order stencil using additional mesh points can be used to achieve better accuracy. Another potential improvement, suited to a more in-depth exploration, is investigating techniques that dynamically refine the mesh in regions where the solution is not smooth, potentially yielding more accurate solutions with fewer mesh points. An example of such a technique is the Adaptive Mesh Refinement (AMR) approach detailed by Berger and Oliger (see paper), which refines the mesh locally, using a hierarchical approach, in regions where the solution exhibits rapid changes or high gradients.